On the non-uniqueness problem in integrated information theory

Abstract Integrated Information Theory (IIT) 3.0 is among the leading theories of consciousness in contemporary neuroscience. The core of the theory relies on the calculation of a scalar mathematical measure of consciousness, Φ, which is inspired by the phenomenological axioms of the theory. Here, we show that despite its widespread application, Φ is not a well-defined mathematical concept in the sense that the value it specifies is non-unique. To demonstrate this, we introduce an algorithm that calculates all possible Φ values for a given system in strict accordance with the mathematical definition from the theory. We show that, to date, all published Φ values under consideration are selected arbitrarily from a multitude of equally valid alternatives. Crucially, both Φ = 0 and Φ > 0 are often predicted simultaneously, rendering any interpretation of these systems as conscious or not non-decidable in the current formulation of IIT.


Introduction
Integrated Information Theory (IIT) is a leading contender among contemporary theories of consciousness (see, e.g., citation tracking statistics for IIT and other theories using the SCOPUS database Burnham (2006)). What sets it apart from other contemporary theories, such as Global Neuronal Workspace Theory Dehaene and Changeux (2004) or Predictive Processing Dołęga and Dewhurst (2020); Wilkinson et al. (2019); Seth (2014); Hobson et al. (2014); Hohwy (2018), is IIT's commitment to mathematical rigor and the resulting potential to make quantitative predictions. Epistemologically, the theory is grounded in a phenomenology-first approach Negro (2020), meaning that the theory starts with phenomenological axioms and, from these, deduces a mathematical measure.
The process of deducing a mathematical measure from the axioms of the theory has two key components: first, the phenomenological axioms must be translated into physical postulates, and second, the physical postulates must be translated into well-defined mathematical objects. In both steps, there is a chance for the definition of Φ to lose specificity. The first arises because the map from phenomenological axioms to physical postulates may not be unique, meaning that the phenomenological axioms of the theory must be revised in order to provide necessary and sufficient conditions that lead to a precise physical formulation. Similarly, it is possible that many different mathematical objects capture the physical processes that embody consciousness. In both cases, the problem is that the phenomenological axioms of the theory are not precise enough to constrain a detailed physical or mathematical description (or, at minimum, a class of mathematical forms that are shown to yield equivalent predictions). Conversely, the level of specificity required for mathematical rigor has not yet found rigid justification in terms of phenomenology: mathematical differences such as the choice of distance measure (Wasserstein distance or Kullback-Leibler divergence) do not easily translate into descriptions of the subjective differences they intend to capture.
Critiques of IIT focus on different aspects of this problem. For example, Barrett and Mediano (2019) point out that the mathematical definition of Φ is insufficiently constrained by the postulates of the theory, while McQueen (2019) highlights an invalid inference procedure from axioms to postulates. In addition, there are many critiques of the axioms themselves Cerullo (2015); Mindt (2017); Bayne (2018), which demonstrate the difficulty in trying to specify phenomenological axioms for consciousness, even without the rigor required by a mathematical formalism. Here, we present a fundamentally different type of ambiguity. Rather than focusing on the translation from phenomenology to postulates or postulates to mathematics, we accept the axioms, postulates, and mathematical definition of the theory at face value and show that the Φ values that result are neither unique nor specific. The values are non-unique in the sense that a multitude of values result from the precise application of the theory, and the values are non-specific in the sense that the possible values are not confined to a small portion of the range. We demonstrate the extent of this problem by calculating all possible Φ values for a corpus of small systems, with the aim of providing context to existing results and emphasizing the importance of avoiding non-unique algorithms for IIT moving forward, particularly in efforts to test its predictions across various experimental systems.

The non-uniqueness problem
The observation that Φ is indeterminate for some systems has been known since as early as 2012 Tononi (2012). This indeterminacy arises as a consequence of what has occasionally been referred to as "underdetermined qualia" or "tied purviews" in the IIT literature Krohn and Ostwald (2017); Moon (2019). Yet, the extent to which this problem must be addressed in order to interpret already published Φ values remains largely unexamined. In Section "Mathematical Details", we provide a detailed mathematical description of the issues surrounding non-uniqueness. To summarize here, the challenge is that Φ ("big phi") is the result of a minimization routine that is not guaranteed to produce a unique value. One applies a scalar function, φ ("little phi"), to an ensemble of probability distributions (called "causes" and "effects") and selects the distribution with the smallest value. However, if multiple distributions in the ensemble have the same φ value, which is often the case, there is no prescription to decide which distribution to carry forward in the calculation. Since the final result is sensitive to this choice of distribution, a multitude of Φ values can result from the calculation.
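To make the failure mode concrete, consider the following minimal sketch in Python (the numbers are illustrative, taken from the AND+OR example developed in Section "Mathematical Details"): the minimum φ value is unique, but the choice of distribution that attains it is not.

    import numpy as np

    # Three candidate distributions tie on little phi, but each has a
    # different shape; whichever one is carried forward changes the
    # downstream big-Phi calculation.
    little_phi = np.array([1/6, 1/6, 1/6])
    repertoires = [np.array([1/3, 1/3, 1/3, 0.0]),
                   np.array([1/3, 1/3, 1/6, 1/6]),
                   np.array([1/2, 1/2, 0.0, 0.0])]
    ties = np.flatnonzero(little_phi == little_phi.min())
    print(ties)  # [0 1 2]: three equally valid choices, no tie-breaking rule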
Despite its importance, the problem of degenerate causes/effects remains under-addressed, with Moon (2019) a notable exception. To our knowledge, there are only four proposed solutions to this problem Oizumi et al. (2014); Krohn and Ostwald (2017); Moon (2019), and they differ drastically in content.
The first solution was put forth by Oizumi et al. (2014), where in Figure S1 of their Supporting Information, the authors suggest that the degenerate cause corresponding to the biggest purview element should be selected as the unique cause. Note that a purview element is simply a subset of the system for which a cause/effect is calculated (Section "Mathematical Details"). The justification for this argument rests on the assumption that a larger purview "specifies information about more system elements for the same value of irreducibility" Oizumi et al. (2014); in other words, the values of the degenerate causes/effects may be the same, but bigger purview elements constrain more of the system and should therefore be selected. However, as pointed out by Moon (2019), this justification does not derive from the postulates of IIT, and alternative solutions are equally consistent with the phenomenology. Furthermore, there is no guarantee that the largest purview element is unique, as "tied purview elements are often the same size". Thus, this cannot be a general resolution to the problem. Instead, it is a specific solution that yields a unique Φ value for the system under consideration by Oizumi et al. (2014). It is worth noting that this approach is adopted by Mayner et al. in the implementation of the Python package PyPhi Mayner et al. (2018), which is widely used to calculate Φ in practice. That said, the computational implementation of the algorithm also fails to address what to do in the case of tied purview elements of the same size and defaults to selecting the first of the elements under consideration.
The second proposed solution, introduced by Krohn and Ostwald (2017) (hereafter KO17), is the opposite of that proposed by Oizumi et al. Instead of selecting the largest purview element, KO17 argues that it is the smallest purview element that should be selected as the core cause/effect in the case of tied purviews. Their motivation stems from a central idea in IIT that "causes should not be multiplied beyond necessity" Tononi (2012). This point is crucial in justifying IIT's exclusion postulate but was not originally intended for application to the dimensionality of tied purview elements. More importantly, this proposed solution also does not address the case in which the tied purview elements are the same size, which means that it does not guarantee a mathematically well-defined value of Φ.
To remedy this, KO17 presents a third solution based on a modified version of Φ that takes the sum of the degenerate values rather than arbitrarily selecting, or imposing a rule to select, just one. The benefit of this modified definition is that it always results in a single Φ value, but the downside is that it does not account for the shape of the probability distribution attached to the selected cause/effect, which is central to the standard interpretation of IIT. Without carrying the probability distribution forward in the calculation, there is no longer any notion of concepts in qualia space or quales in general Haun and Tononi (2019).
The last proposed solution is the "differences that make a difference" criterion put forth by Moon (2019). Here, the author argues that if degenerate causes/effects exist, then none of the corresponding purview elements should be selected. This solution is based on the idea that in IIT to exist is to cause a difference "from the intrinsic perspective of the system", which is quantified by φ. If tied purview elements exist with the same φ value, then it is possible to eliminate one of the elements without changing the φ value, and therefore that element does not exist from the intrinsic perspective of the system. For example, if the purview elements A^p and B^p both give rise to φ = 1/6, one can eliminate B^p without changing the φ value from 1/6; therefore, the existence of B^p does not make a meaningful difference according to φ. If one repeats this argument in turn for each of the degenerate causes/effects, one is forced to conclude that each fails to exist from the intrinsic perspective of the system and, consequently, none of the tied purview elements can be selected. While this approach is perhaps the most principled, it leads to Φ = 0 for systems that are clearly integrated. For example, if we apply the Moon definition of Φ to a fully connected AND+OR logic gate system, such as that shown in Figure 1(a), we find Φ = 0 despite the fact that both logic gates are integrating information from disparate inputs. Put simply, cutting the input to an AND gate does make a difference in terms of its ability to integrate information, and a measure of integrated information that does not account for this should be cause for concern.
To summarize, challenges to the calculation of Φ and its uniqueness arise with all four existing solutions to the non-uniqueness problem. In the case of selecting the smallest or largest purview element, we are not guaranteed a unique mathematical solution due to the fact that tied purviews are often the same size. In addition, the justification as to why we should select either the smallest or largest element poses challenges when connecting back to the core phenomenology of IIT. Conversely, the modified definitions of Φ put forward by Moon and KO17 do produce unique values but violate expectations of what Φ is designed to measure. In the case of the KO17 definition of Φ, we throw away crucial information about the shapes of the underlying distributions, which underpin the notion of qualia in IIT. Similarly, the Moon definition is overly restrictive in that it often yields Φ = 0 for systems that are clearly integrating information, such as the fully connected AND+OR system.

Methodology
In this section, we identify and address the source of non-uniqueness in IIT 3.0. We assume that the reader is familiar with the basics of the theory, as presented in Oizumi et al. (2014), although for our purposes it will suffice to understand that the final Φ value is the result of a nested sequence of six different min/max operations, as shown in Algorithm 1. Each min/max optimization yields a different measure of integrated information, which means that there are six different measures to keep track of (φ^MIP and φ^Max for both causes and effects, along with Φ^MIP and Φ^Max). Note that it is the outermost measure, Φ^Max, that corresponds to the overall level of consciousness according to the theory; however, this last optimization step is often neglected with the understanding that one is typically only interested in a specific subsystem, rather than the system as a whole. This is because considering the system as a whole is both computationally intractable (see Section "On the computational complexity of Φ") and philosophically challenging in all but the simplest cases Hoel et al. (2016). For these reasons, it is the Φ^MIP value, rather than the Φ^Max value, that is typically published, as this is the end result of most calculations and the standard output from PyPhi. In practice, the superscript is usually dropped as a matter of convenience, leaving the reader to infer from context whether "Φ" refers to Φ^MIP, Φ^Max, or just Φ. Fortunately, this difference between Φ values rarely makes a difference in terms of understanding, since the measures are closely related conceptually. In our results, the degeneracy occurs well before both the Φ^MIP and Φ^Max calculations, so both suffer from the same problem.
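The nested structure can be illustrated with a toy calculation (a sketch only: the real Algorithm 1 operates on repertoires and partitions rather than bare numbers, and the function names below are ours, not PyPhi's):

    # phi^MIP: minimize over the partitions of a purview (lists of costs)
    def phi_mip(partition_costs):
        return min(partition_costs)

    # phi^Max: maximize phi^MIP over candidate purviews (ties arise here),
    # then take the minimum of the best cause and the best effect
    def phi_max(cause_purviews, effect_purviews):
        best_cause = max(phi_mip(p) for p in cause_purviews)
        best_effect = max(phi_mip(p) for p in effect_purviews)
        return min(best_cause, best_effect)

    # one mechanism with two candidate purviews on each side
    print(phi_max([[0.2, 0.5], [0.3]], [[0.4], [0.1, 0.6]]))  # prints 0.3

The remaining two operations (the minimization of Φ over system cuts and the maximization of Φ^MIP over subsystems) wrap around this structure in the same way.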
Specifically, it is the calculations of the core cause and core effect in Lines 12 and 17 of Algorithm 1 that are the source of non-uniqueness in the routine for calculating Φ. In both cases, the problem amounts to an inability to uniquely specify the purview element associated with the maximal φ^MIP value, as shown in Figure A2. In Section "Mathematical Details", we provide a detailed explanation of this problem, but the key feature is that both the φ^MIP value and the probability distribution associated with the corresponding purview element are carried forward in the calculation. If it were just the scalar value that was carried forward, then there would be no meaningful difference between degenerate values. However, in Line 19 of Algorithm 1, we must calculate a distance between sets of probability distributions, called cause-effect structures (CESs), and the shape of these distributions depends on which purview element is selected. In other words, the purview element with the maximal φ^MIP value determines the shape of the probability distributions going into the calculation of Φ, and these distributions can have different shapes with the same φ^MIP values.
To address this problem, one needs to consider each of the tied purview elements in turn. The Python package PyPhi (available for download at http://integratedinformationtheory.org) provides all the basic functionality needed to do this. Namely, it allows the user to calculate purviews, cause/effect repertoires, Earth Mover's Distance (EMD) Pele and Werman (2008), CES distance (also known as the extended EMD), and more. In addition, it contains prebuilt classes for data structures corresponding to concepts that are useful in the calculation. Thus, our methodology is to wrap the basic functionality of PyPhi into a modified version, which we call PyPhi-Spectrum, that allows the user to calculate all possible Φ values for a given subsystem using the prebuilt PyPhi functions. Our algorithm follows the mathematical definition of IIT precisely, with the only difference to PyPhi being that whenever there is a minimization or maximization procedure, it looks for all minimizers or maximizers, forks the state of the computation accordingly, and carries on computations according to the mathematical definition for all forks (Section "The PyPhi-Spectrum package"). To install PyPhi-Spectrum, download or clone the repository from https://github.com/elife-asu/pyphi-spectrum.
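The forking strategy can be sketched as follows (our own illustration of the idea in pure Python, not code from the PyPhi-Spectrum package itself):

    from itertools import product

    # Keep every tied maximizer instead of picking one, then take the
    # Cartesian product of the choices across mechanisms.
    def all_argmax(candidates, key):
        best = max(key(c) for c in candidates)
        return [c for c in candidates if key(c) == best]

    # toy data: two mechanisms, each with (purview, phi) candidates
    mechanisms = {
        "A": [("A_p", 1/6), ("B_p", 1/6), ("AB_p", 1/6)],  # three-way tie
        "B": [("A_p", 1/4), ("AB_p", 1/4)],                # two-way tie
    }
    forks = [all_argmax(c, key=lambda x: x[1]) for c in mechanisms.values()]
    print(len(list(product(*forks))))  # 6 distinct constellations to evaluate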
As a concrete example of our methodology, we consider the simple AND+OR system shown in Figure 1(a). Here, each cut generates a host of different Φ values due to the degeneracy in the optimization routine, as discussed in detail in Section "Mathematical Details". Feeding this system into PyPhi-Spectrum, we find that there are 83 valid Φ values, as opposed to the single Φ value output from PyPhi. Crucially, both Φ = 0 and Φ > 0 are predicted, meaning that there is no longer a clear prediction as to whether or not this system is conscious.

Results
In this section, we apply PyPhi-Spectrum to a variety of recently published systems in order to determine the extent to which non-uniqueness affects the interpretation of results in the published literature.

Case study: three-node fission yeast cell cycle
As a case study, we first consider the Boolean network model of the fission yeast cell cycle from Marshall et al. (2017) (one of our co-authors was also an author on this study). In the study by Marshall et al., IIT is used to analyze the causal structure of a minimal biological system, namely, the cell cycle of the fission yeast Schizosaccharomyces pombe. Using Φ values output from PyPhi, the authors identify three integrated subsystems corresponding to the full regulatory network (eight nodes), a six-node subsystem, and a three-node subsystem, all potentially of biological importance. Of these three systems, only the smallest is amenable to our analysis, although we expect similar results for the other two systems. Applying the methodology from Section "Methodology", we find a spectrum of 244 non-unique Φ values, spanning a range from 0.00 to 0.83 bits, as shown in Figure 2. Only one of these values (Φ = 0.09) is published as the unique Φ value for this subsystem. Furthermore, the inclusion of Φ = 0 in the spectrum of possibilities changes the biological narrative of the results entirely. If the subsystem under consideration has Φ = 0, rather than Φ > 0, it would not be identified as "integrated", and the connection between integration and the autonomy discussed in the manuscript cannot be made. Thus, the conclusion that this subsystem is of biological importance depends entirely on the arbitrary selection of a single Φ value from the spectrum of possibilities. In general, it is impossible to tell a priori whether the interpretation of results based on a singular Φ value is valid without considering the entire spectrum of possibilities.

The non-uniqueness of published Φ values
Next, we use PyPhi-Spectrum to analyze a corpus of recently published Φ values. This corpus is meant to be comprehensive but is not exhaustive, in particular due to the limitations imposed by the computational resources required to perform these calculations (Section "On the computational complexity of Φ"). The systems, summarized in Table 1, are selected primarily for their size, although size alone is not a good indicator of computational tractability. For example, certain three-node systems, such as that of Hanson and Walker (2021), have thousands of degenerate CESs, while other three-node systems, such as that of Farnsworth (2021), have just a few. It is not readily apparent what dictates the number of non-unique CESs that, in turn, govern tractability, although underlying symmetries undoubtedly play a role. Consequently, our corpus is characterized by small systems (2-4 nodes), which allow relatively fast evaluation via PyPhi-Spectrum.
Our primary result is shown in Figure 3 and demonstrates the entire spectrum of possible Φ values for each system in our corpus relative to its published value. There are several things to note. First, the existence of a unique Φ value is rare: only the photodiode example has a spectrum consisting of a single value (the number of different Φ values is denoted by |Φ| in Figure 3). For the rest of the corpus, the spectra often consist of dozens if not hundreds of non-unique values, of which only one is published (denoted as an "x" in Figure 3). There are also multiple cases where the spectrum contains both Φ = 0 and Φ > 0 values, demonstrating that the question of whether these systems are conscious is non-decidable in the current formulation of the theory. This has implications for the logical foundations of the theory and, in particular, for the exclusion postulate, as this postulate causes the degeneracy (Section "Conclusion"). Additionally, the span of possible Φ values is often comparable to the entire range of possibilities expected for systems of this size. Typical Φ spectra in Figure 3 span roughly half of a bit for two-node Boolean systems, which are bounded from above by Φ ≤ 1.5 and from below by Φ = 0 (cf. Section "Calculating an upper bound on Φ"). This implies that the Φ values specified by IIT are not only non-unique but also non-specific in the sense that they do not constrain the Φ value to a small portion of the possible range.

Table 1. Summary of the corpus in reverse chronological order. Sources were selected based on the publication of a Φ value and computational tractability. Additional details required for analysis, such as transition probability matrices and initial states, are provided in Section "Additional details related to the calculation of Φ values" (Tables A1-A11).

Implementing existing solutions
In Section "The non-uniqueness problem", we discussed logical problems inherent with each of the four proposed solutions to the non-uniqueness problem. Even so, we can easily apply these solutions to our corpus using PyPhi-Spectrum 3 and study the results, shown in Figure 4. As expected, the KO17 and Moon solution yield Φ = 0 for several systems that are clearly integrated, such as the AND+OR system. Indeed, the fact that the Moon solution yields Φ = 0 for so many systems that were previously considered integrated suggests that Moon's solution deviates from the accepted notion of an integrated system in IIT. It is also clear that the "smallest" criterion does little to mitigate the degeneracy. At first One approach is to calculate the number of possible valuesdenoted | |-as a function of the system size n. We then compare this to the number of purview elements under consideration, which is always ! in order to determine if degenerate values are guaranteed by the pigeonhole principle Herstein (1991). In short, if there are more values than there are purview elements, then there must be degenerate purview elements. We introduce a measure of degeneracy = ! | | that can be used to track how many degenerate values are assigned to each purview element on average. The scaling of L as a function of n determines whether or not degeneracy plays a role in larger systems.
For example, suppose that probability distributions come in discretized units of 1/n, where n is the number of states in the system; note that this is the case for all deterministic transition probability matrices. The number of unique probability distributions that can be built subject to this constraint is equivalent to the number of ways we can assign n units of probability to n different states (the distribution's support). For each of the n units, there are n choices, so the number of unique probability distributions is at most n^2. Since φ is a distance measure between two probability distributions, the number of unique φ values is at most |φ| = n^2 × n^2 = n^4. If we then compare this to the number of purview elements under consideration, we find L = n!/n^4, which goes to infinity as n goes to infinity. This implies that degeneracy gets worse, and not better, with increasing system size.
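A short calculation makes the divergence explicit (a sketch following the counts above; the n! and n^4 terms are taken directly from the text):

    from math import factorial

    # L = (purview elements) / (possible phi values): n! elements compete
    # for at most n**4 distinct values, so ties are eventually guaranteed
    def degeneracy_L(n):
        return factorial(n) / n**4

    for n in [4, 8, 12, 16]:
        print(n, degeneracy_L(n))  # 0.09, 9.8, 2.3e4, 3.2e8: L diverges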

Conclusion
In addition to the mathematical problem of non-uniqueness, there is a philosophical problem as well. Namely, non-uniqueness arises as a direct consequence of the exclusion postulate in IIT, which is used to localize consciousness into singular experiences. Mechanistically, the way this is achieved is by suppressing the multitude of intermediate Φ values in favor of the extremum, which can then be uniquely mapped onto a single conscious percept. Non-uniqueness, then, implies a philosophical problem in which there is no clear way to map the multitude of Φ values onto a single conscious percept: either all the Φ values are valid, which results in simultaneous conscious experience, or an additional postulate is required to justify the choice between values. In the former case, it is a phenomenological axiom that is violated, while the latter case requires at least one additional postulate. In both cases, the revision goes well beyond a simple mathematical fix and points to problems with the so-called "hard core" of the theory Negro (2020); Lakatos (1968). It is for this reason that Moon emphatically argues that "the underdetermination problem shakes IIT to the ground" Moon (2019).
The problem of non-uniqueness in IIT fits into a more general problem faced by phenomenology-first theories of consciousness. If we begin with a small set of axioms, it is virtually impossible for these axioms to uniquely constrain a precise mathematical measure. At each step in the operationalization, we must add details to the phenomenology in order to define the mathematics, which means that either the mathematical choices lack phenomenological grounding or the list of phenomenological axioms becomes increasingly contrived. For example, in IIT 2.0, the Kullback-Leibler divergence was used to calculate distances between probability distributions, whereas the EMD was used in IIT 3.0. Mathematically, these two measures are very different and will yield qualitatively different predictions for the same system. This implies that only one of the two measures can be correct, yet it is difficult to imagine how one can justify the choice between these two distance measures in terms of "what it is like" to be conscious Nagel (1974). If we are to take a phenomenology-first approach to consciousness, we must accept that there is likely not going to be a unique measure of consciousness that results from the phenomenology of the theory, but rather, a host of different measures that are consistent with the axioms of the theory. We must then rely on experiments, grounded by independent benchmarks, to falsify the different quantitative predictions.
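The sensitivity to the choice of distance measure is easy to demonstrate (a sketch with illustrative distributions; scipy.stats.entropy computes the Kullback-Leibler divergence when given two arguments):

    import numpy as np
    from scipy.stats import entropy

    # KL divergence ignores the Hamming geometry of the state space, so two
    # distributions that are equally far from p under KL can sit at very
    # different transport (EMD) distances. States ordered 00, 01, 10, 11.
    p = np.array([1/2, 1/2, 0, 0])
    q_near = np.array([1/4, 1/2, 1/4, 0])  # 1/4 of p's mass moved 00 -> 10
    q_far = np.array([1/4, 1/2, 0, 1/4])   # 1/4 of p's mass moved 00 -> 11
    print(entropy(p, q_near, base=2), entropy(p, q_far, base=2))  # 0.5, 0.5
    # EMD(p, q_near) = (1/4)(1) = 0.25, but EMD(p, q_far) = (1/4)(2) = 0.5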
To be clear, the non-uniqueness problem in IIT 3.0 is not a case of a precise mathematical measure that fails to find justification in terms of its phenomenological axioms, but rather, an ill-defined mathematical measure that demonstrably violates the hard core of the theory. That said, any proposed solution to the non-uniqueness problem must contend with the general lack of specificity provided by a phenomenology-first approach. Such an approach requires experimental evidence as a means to justify the specific choices that were made in translating phenomenological axioms into precise mathematics. There is a growing body of literature discussing how experiments may be used to assess different theories of consciousness Doerig et al. (2020) and how the mathematical structure of the theory, in general, must be consistent with what is known experimentally about conscious percepts Kleiner and Ludwig (2023). Assessing the impact of non-uniqueness, in all of its forms, is a crucial part of this research paradigm.

Mathematical details
To demonstrate the problems inherent in the mathematical definition of Φ, we will consider a simple system comprising an AND gate and an OR gate connected to each other, as shown in Figure 1. Since there are only two elements, we need not worry about the outermost optimization over subsystems, as a two-element system can have only one meaningful subsystem (i.e. Φ^MIP = Φ^Max for two-element systems). To calculate Φ^MIP, we first must initialize the system into a given state. Here, we assume an initial state s_0 = 00, with the understanding that this specific choice of initial state does not impact the general result. The next step is to identify the CES or constellation C corresponding to the transition probability matrix (TPM) of the unpartitioned system. To do this, one must find the "core cause" and "core effect" of every potential mechanism in the system, where a mechanism is any element in the power set of the subsystem. In this case, the potential mechanisms are the elements of P({A, B}) = {A^c, B^c, A^cB^c}, where the superscript c denotes the mechanism in its current state. For each element in this set, we must identify how well it constrains elements in the past power set P({A^p, B^p}), known as the past purview, as well as how well it constrains elements in the future power set P({A^f, B^f}), known as the future purview.
We then measure the EMD Pele and Werman (2008), denoted D, between the constrained distribution of each purview element and the constrained distribution of each purview element under the "minimum information partition" (MIP):

φ^MIP = D(p(z | m = s_0) || p_MIP(z | m = s_0)),

where z is the purview element and m is the mechanism. The distribution p(z | m = s_0) tells us the likelihood of z given that the current state of m is s_0, which, compared to an unconstrained distribution, tells us how much information m is generating about z. We also need to know whether or not that information is integrated, so we must break m and z up into all possible parts and ask whether the parts acting independently can generate the same amount of information as the whole. For example, to find how much integrated information is generated by the mechanism A^c about the purview element z = AB^p, we calculate the probability distribution p(AB^p | A^c = 0) and compare it to the two possible partitions of the purview: A^c/AB^p → (A^c/A^p × []/B^p) and A^c/AB^p → (A^c/B^p × []/A^p). The first partition allows A^c to constrain A^p but leaves B^p unconstrained (denoted by an empty bracket []), while the second partition allows A^c to constrain B^p but leaves A^p unconstrained. The distributions generated by these partitions, shown in Figure A1, are then compared to the distribution generated by the unpartitioned system, and the partition that minimizes the EMD to the unpartitioned system is the MIP for this purview/mechanism combination. If multiple partitions yield the same distance to the unpartitioned system, as is the case in Figure A1, it is irrelevant which one is chosen, as all that moves forward from this step of the computation is the scalar value of φ^MIP.
Once we have identified the MIP (and calculated φ^MIP) for all purview elements for a given mechanism, we define the core cause and core effect as the past and future purview elements with the greatest φ^MIP. We denote the integrated information of the former as φ_cause and of the latter as φ_effect, and we define the total integrated information φ^Max of a given mechanism as φ^Max = min[φ_cause, φ_effect].

Figure A1. All possible partitions of a given mechanism and purview combination and the resulting φ^MIP value.

Figure A2. All possible purview elements and their MIPs for a given mechanism. It is here that the degeneracy is introduced, as one cannot select a unique core cause or effect for a given mechanism if there are purview elements with the same φ^MIP values.
If φ^Max > 0 for a mechanism, we say that the mechanism gives rise to a "concept". A concept is a tuple comprising three things: a scalar φ^Max value, a probability distribution corresponding to the core cause repertoire, and a probability distribution corresponding to the core effect repertoire. We have already provided an example of a cause repertoire in Figure A1; namely, it is the distribution over previous states of the purview element given the current state of the mechanism. In Figure A1, the state of A^c constrains the probability of observing AB^p, and this constrained distribution is the cause repertoire for the purview element AB^p. Any element not included as part of the purview is left unconstrained and must be independently "noised" Oizumi et al. (2014). For example, if the purview of mechanism A^c is A^p, we generate the constrained distribution p(A^p | A^c = s_0) and combine this with the unconstrained distribution for B^p (denoted p(B^p)). For the AND+OR system, we have p(A^p | A^c = 0) = [2/3, 1/3] and p(B^p) = [1/2, 1/2], which yields the cause repertoire [1/3, 1/3, 1/6, 1/6], where states are ordered in binary with the most significant digit on the left (00, 01, 10, 11).
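In code, the noising step is just a tensor product (a minimal sketch using numpy, with the state ordering described above):

    import numpy as np

    # Cause repertoire over (A_p, B_p): constrained A_p distribution times
    # the unconstrained (uniform) B_p distribution, states 00, 01, 10, 11
    p_A_given_Ac0 = np.array([2/3, 1/3])      # p(A_p | A_c = 0), AND gate
    p_B_unconstrained = np.array([1/2, 1/2])  # independently noised B_p
    print(np.kron(p_A_given_Ac0, p_B_unconstrained))  # [1/3 1/3 1/6 1/6]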
The effect repertoire for a given purview element is generated in the same way as a cause repertoire. For example, the probability of observing A^f given the mechanism state A^c = 0 is p(A^f | A^c = 0) = [1, 0]. Combining this with the unconstrained future distribution for B^f yields the effect repertoire [1/4, 3/4, 0, 0]. Note that this is an example where the unconstrained future distribution for B^f is not uniform. This is a direct result of the noising procedure: an OR gate receiving uniform random input is three times as likely to be in State 1 as it is to be in State 0. Furthermore, when more than one target node is involved, we must send independent noise to each target (to avoid correlated input).
We can now define the CES (constellation) C as the set of all concepts for the system in the given state. Recall that each concept corresponds to a single mechanism and comprises the mechanism's core cause repertoire, core effect repertoire, and φ^Max value. In our example, there are at most three concepts, corresponding to the mechanisms {A^c, B^c, A^cB^c}. The core cause repertoire for a mechanism is found by optimizing over all past purview elements and identifying the purview with the highest φ^MIP, where φ^MIP is found by further optimization over all possible partitions. Figure A2 shows φ^MIP and the corresponding partition for all possible purview elements given the mechanism A^c.

Degenerate core causes and effects
We are now in a position to see how degenerate core causes and effects can occur and their consequences. The postulates of IIT, and the exclusion postulate in particular, imply that a unique core cause should be assigned to each mechanism, but the purview element that generates φ^Max is not unique. As Figure A2 shows, A^p, B^p, and AB^p all generate the same φ^MIP value for the mechanism in question. Since each purview/mechanism combination is associated with a different cause repertoire, "the core cause repertoire and the resulting constellation C are not uniquely defined". If the scalar value of φ^Max were all that mattered to the calculation of Φ, this degeneracy would be inconsequential (as is the case for partitions that generate the same φ^MIP value for a given purview element). However, system-level integrated information Φ is defined as the cost of transforming the core cause/effect repertoires from one constellation C into another C′. That is, Φ = D(C || C′), where D is an extension of the EMD that calculates the cost of moving φ^Max "between repertoires" Oizumi et al. (2014). If the core cause or effect repertoire changes, the distance between constellations will change accordingly, as the distance metric that goes into the EMD calculation is sensitive to the relative shape of the distributions and not just the scalar φ^Max values. For example, if we were to choose AB^p as the core cause for mechanism A^c, this generates the concept in C given by the tuple {[1/3, 1/3, 1/3, 0], [1/2, 1/2, 0, 0], 1/6}, where the first element is the core cause repertoire, the second element is the core effect repertoire, and the third element is the φ^Max value. However, we could just as easily have chosen A^p as our core cause and A^f as our core effect, in which case the concept generated for A^c would be {[1/3, 1/3, 1/6, 1/6], [1/4, 3/4, 0, 0], 1/6}. Clearly, these choices generate concepts with measurably different repertoires.

To illustrate the consequences of this, let C be the constellation consisting only of the concept generated by A^c with core cause AB^p and core effect AB^f, and let C′ be the constellation consisting of only the null concept for this system (the unconstrained cause repertoire [1/4, 1/4, 1/4, 1/4] and the unconstrained effect repertoire [3/16, 9/16, 1/16, 3/16]). The extended EMD is the cost of transforming C into C′ by moving φ^Max = 1/6 a distance given by the sum of the (regular) EMD between cause repertoires and effect repertoires. Namely, we have Φ = 1/6 × (1/3 + 1/2) = 5/36, whereas choosing A^p and A^f instead yields Φ = 1/6 × (1/6 + 1/4) = 5/72. Thus, we get different values of Φ depending on our choice of core cause and core effect. In total, 83 different Φ values result from this process, as shown in Figure 1(b).

The last step is to calculate Φ^MIP, which is defined as the minimum Φ value over all possible cuts. For the AND+OR system, there are only two possible cuts; however, each cut has a host of possible Φ values. We must consider the pairwise combinations of the Φ values from each cut in turn and ask what the minimum is in order to build up the set of possible Φ^MIP values. For example, if the Φ values for cut 1 are [0, 1, 2] and the Φ values for cut 2 are [1, 2, 3], then only Φ = 3 is excluded from being a valid Φ^MIP value. The reason for this is that all other Φ values are the minimum of some combination of Φ values from the cuts under consideration (e.g. when Φ = 3 in cut 2 and Φ = 2 in cut 1, then Φ^MIP = 2), but there is no situation for which Φ = 3 is the minimum across all cuts, because the maximum Φ value from cut 1 is always less than Φ = 3. In general, the maximum Φ^MIP value is set by the cut with the smallest maximum Φ value, as is evident in Figure 5(b).
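The two constellation distances above can be checked numerically with a small transportation linear program (a sketch of the EMD under a Hamming ground metric, written by us for illustration and not taken from the PyPhi implementation):

    import numpy as np
    from scipy.optimize import linprog

    def emd(p, q):
        n = len(p)
        # ground metric: Hamming distance between state indices
        D = [[bin(i ^ j).count("1") for j in range(n)] for i in range(n)]
        c = np.array(D, float).ravel()
        A_eq, b_eq = [], []
        for i in range(n):  # row sums of the transport plan equal p
            row = np.zeros((n, n))
            row[i] = 1
            A_eq.append(row.ravel())
            b_eq.append(p[i])
        for j in range(n):  # column sums of the transport plan equal q
            col = np.zeros((n, n))
            col[:, j] = 1
            A_eq.append(col.ravel())
            b_eq.append(q[j])
        return linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None)).fun

    null_cause, null_effect = [1/4] * 4, [3/16, 9/16, 1/16, 3/16]
    d1 = emd([1/3, 1/3, 1/3, 0], null_cause) + emd([1/2, 1/2, 0, 0], null_effect)
    d2 = emd([1/3, 1/3, 1/6, 1/6], null_cause) + emd([1/4, 3/4, 0, 0], null_effect)
    print(d1 / 6, d2 / 6)  # 0.1389 (= 5/36) and 0.0694 (= 5/72)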

On the computational complexity of Φ
For a subsystem of size n, the computational complexity scales as follows. First, one must calculate the CES for every possible partition of the subsystem. If the partition is a bipartition, as is typically assumed Mayner et al. (2018); Krohn and Ostwald (2017), the number of ways to do this is S(n, 2), where n is the size of the subsystem and S(n, k) are Stirling numbers of the second kind Stanley (2011). However, two small modifications must be applied: first, partitions are unidirectional, and second, the unpartitioned system must also be included. The former consideration results in twice as many bipartitions, while the latter results in a single additional partition. Combining these yields a total of 2S(n, 2) + 1 partitions, which, for large n, is well approximated as 2S(n, 2). For each CES, there are 2^n - 1 potential mechanisms, corresponding to the size of the power set of elements excluding the empty set.
For each potential mechanism, there are C(n, k) purview elements of size k, each of which can be partitioned S(k, 2) ways. Therefore, there are 2 Σ_k C(n, k) S(k, 2) ≈ 3^n elementary distance calculations that must be performed to calculate a single CES, where the additional factor of two is due to the need to optimize φ^Max over both past and future purviews. Putting this together, there are a total of [2S(n, 2) + 1] × (2^n - 1) × O(3^n) ≈ O(12^n) elementary distance calculations required to get the system-level integrated information Φ for a given subsystem. For the global system, this calculation must be embedded in an additional optimization corresponding to maximizing over the power set of all possible subsystems, i.e. Σ_n C(m, n) O(12^n) ≈ O(13^m) calculations are required to find Φ^Max for a global system of size m. For all but the smallest m values, the computational resources required to actually calculate Φ^Max are impossible to realize. Interestingly, the O(13^n) scaling derived here is in tension with the previously published value of O(53^n) Mayner et al. (2018). This could be because the O(53^n) scaling considers all possible partitions, rather than strict bipartitions, or perhaps because it resolves the elementary computation in terms of some more fundamental operation (e.g. bit flips). Without additional information on how O(53^n) was derived, it is difficult to say whether either of these considerations resolves the tension between values. We do note, however, that the published values of t = 1, t = 16, and t = 9900 for n = 3, n = 5, and n = 7 Mayner et al. (2018) are within an order of magnitude of the calculated O(13^n) scaling but off by 40 orders of magnitude from O(53^n) scaling.
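These counts are easy to reproduce (a sketch following the bipartition assumptions above; S(n, 2) = 2^(n-1) - 1 in closed form):

    from math import comb

    def stirling2(n):  # S(n, 2), the number of bipartitions
        return 2**(n - 1) - 1 if n >= 2 else 0

    def ops_per_subsystem(n):
        partitions = 2 * stirling2(n) + 1  # unidirectional cuts + uncut
        mechanisms = 2**n - 1              # nonempty subsets of elements
        per_ces = 2 * sum(comb(n, k) * stirling2(k) for k in range(2, n + 1))
        return partitions * mechanisms * per_ces

    for n in range(3, 8):
        print(n, ops_per_subsystem(n), 12**n)  # grows like 12**n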

Calculating an upper bound on Φ
It is relatively straightforward to calculate a loose upper bound on Φ for a subsystem of size n. To do so, one needs only to understand the extension of the earth mover's distance D that is used in the calculation Φ = D(C || C_MIP). By definition, the "earth" being moved is φ^Max between concepts in the unpartitioned CES (C) and the partitioned CES (C_MIP), while the "distance" it is moved is measured by the regular earth mover's distance between concepts Oizumi et al. (2014). In light of this, a straightforward upper bound on Φ can be found by asking what the maximum value of φ^Max is for each concept and moving that amount as far away as possible. For a mechanism of size m, its φ^Max value is bounded from above by the maximum value of the regular earth mover's distance, which is EMD_Max(m) = m. It is easy to see this is the case, as EMD_Max is achieved when all the probability (P = 1) is moved the maximum Hamming distance (H_Max), which is m for a mechanism comprising m bits. For example, EMD_Max for a three-bit mechanism is achieved when P = 1 is moved from State 000 to State 111 (H_Max = 3), so EMD_Max = H_Max = 3. Next, we must ask what the maximum distance D_Max in conceptual space is that this amount of φ^Max can be moved. Since this distance is again a regular earth mover's distance, we have D_Max(m) = EMD_Max(m) = m. Thus, the maximum contribution a mechanism of size m can make to the extended earth mover's distance D is bounded from above by EMD_Max(m) × D_Max(m) = [EMD_Max(m)]^2 = m^2. Of course, not all mechanisms are the same size, so the total contribution is bounded by the sum of the maximum contribution from mechanisms of each size, namely,

Φ(n) ≤ Σ_(m=1)^n C(n, m) m^2 = n(n + 1) 2^(n-2).

To date, this is the only known upper bound on Φ that we are aware of (although bounds on φ^Max and on Φ in IIT 2.0 are readily available Krohn and Ostwald (2017); Oizumi et al. (2016); Arsiwalla and Verschure (2016); Tegmark (2016); Toker and Sommer (2016)), and it is a rather loose bound. For a subsystem of size n = 2, as is the case for the AND+OR system we consider in the main text, we have Φ(2) ≤ (1)^2 + (1)^2 + (2)^2 = 6 bits. In practice, we cannot reasonably expect φ^Max = EMD_Max for all mechanisms, as the existence of φ^Max = EMD_Max for one mechanism almost certainly precludes the existence of φ^Max = EMD_Max for another. Likewise, cutting a CES cannot possibly result in a distance of D_Max for all concepts, as additional noise cannot be used to increase the fidelity of constraints. At best, it is likely that concepts map to the null concept in the CES of the MIP, corresponding to a maximum distance D_Max(m) = EMD_Max(m)/2 = m/2. In this case, the bound that results is Φ ≤ 2^(n-3) n(n + 1), which is still likely loose. To tighten it, one must consider the φ^Max values that can result for a system of mechanisms as an ensemble, rather than individually, a task that we found quickly became intractable.
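The bound is simple to evaluate numerically (a sketch; math.comb requires Python 3.8 or later):

    from math import comb

    # Loose upper bound: each mechanism of size m contributes at most m**2
    def phi_upper_bound(n):
        return sum(comb(n, m) * m**2 for m in range(1, n + 1))

    print(phi_upper_bound(2))        # 6 bits, matching the text
    print(2 * (2 + 1) * 2**(2 - 2))  # closed form n(n+1)2^(n-2) agrees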

Numerical approach
For our purposes, a numerical approach will suffice. Given a small enough system, it is possible to calculate the Φ values for every possible TPM that results from Boolean logic on a two-bit system. Namely, each bit (A and B) takes one of two possible states in response to the global state of the system. This means that there are 2^4 = 16 possible state transitions for each coordinate, for a total of 16^2 = 256 unique TPMs. For each, it is possible to calculate the Φ spectrum that results using the algorithm we describe in the main text. Then, the upper bound on Φ is simply the maximum Φ value over all possible TPMs in all possible initial states. Performing this exercise results in the bound Φ ≤ 1.5, which is exactly one-fourth the analytical bound derived in the previous section; as discussed, it is likely that a factor of 1/2 is accounted for if D_Max(m) = m/2, while the other factor of 1/2 may be accounted for by the same type of argument applied to EMD_Max (rather than D_Max). If so, the upper bound on Φ would be 2^(n-4) n(n + 1), which is potentially more tractable to calculate than previously believed.
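The enumeration itself is straightforward (a sketch; the Φ evaluation of each TPM would be delegated to PyPhi-Spectrum and is omitted here):

    from itertools import product

    # One row per global state (00, 01, 10, 11), one column per node: each
    # node's update rule is one of the 2**4 = 16 maps from states to {0, 1}
    states = list(product([0, 1], repeat=2))
    tpms = [[[a[i], b[i]] for i in range(len(states))]
            for a in product([0, 1], repeat=4)
            for b in product([0, 1], repeat=4)]
    print(len(tpms))  # 16 * 16 = 256 candidate TPMs to scan for max Phi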

The PyPhi-Spectrum package

Additional details related to the calculation of Φ values
In this section, we provide the transition probability matrices and initial states necessary to replicate our results (Tables A1-A11). The same data can be found in downloadable form via the GitHub repository: https://github.com/elife-asu/pyphi-spectrum.

Photodiode Chalmers and McQueen (2014); Oizumi et al. (2014)

A photodiode is a simple system of two interacting COPY gates, taking input from one another. It is arguably the simplest "integrated" system one can study and has been studied in the context of IIT at least twice Chalmers and McQueen (2014); Oizumi et al. (2014). We set the initial state of the system to be s_0 = 10. The TPM is given below.

AND+OR Hanson and Walker (2019); Albantakis et al. (2019)
Like the photodiode, the AND+OR system has been studied in the context of IIT at least twice prior to the current work Hanson and Walker (2019); Albantakis et al. (2019). However, a concrete Φ value is yet to be published. Therefore, we take the "published value" to be the PyPhi value found in Section "Methodology". Similarly, we take the initial state to be s_0 = 00 in accordance with Section "Methodology". The transition probability matrix is given below.

Hanson and Walker (2021)
This system is a three-bit digital counter in the initial state 101. The initial state is selected somewhat arbitrarily, since any initial state will work, but s_0 = 101 results in a particularly fast evaluation. The TPM, from Figure 4 of the original publication, is as follows:

Majority gate system
This system comprises three interconnected majority gates, each with three inputs, as shown in Figure 5. If the majority of inputs to a given node are 0, the state of the node at the next time step is 0, and if the majority of inputs to a given node are 1, the state of the node at the next time step is 1. In the main text, the system is evaluated in initial state s_0 = 000. The transition probability matrix is provided below.

Gomez et al. (2021)
This paper studies the p53-Mdm2 biological regulatory network. Typically, this network is multivalued, but there are two possible binarizations that make standard Φ calculations possible. Of these, we chose the Fauré and Kaji binarization, as it is much faster to analyze than the Tonello binarization. Following the authors, we choose an initial state s_0 = 0001 and use the following TPM. Note that the PyPhi value we compute for this TPM differs from that published by the authors due to their use of several non-standard configuration settings, such as Krohn and Ostwald's definition of Φ as a difference in integrated conceptual information rather than the IIT 3.0 definition.

Farnsworth (2021)
In this paper, a virocell (virus-infected cell) is introduced into a Boolean network model of host cell dynamics. There are two network models provided: the first consists of five nodes and is the "full system", while the second consists of three nodes and is the "reduced system". For both systems, we study the case where all the nodes are "ON" (i.e. s_0 = 11111 and s_0 = 111, respectively). Following the Supplementary Material provided by Farnsworth, the transition probability matrices are given below. Note that, in the full system, the second node is an AND gate (as shown in Figure 6 of their main paper Farnsworth (2021)) rather than a COPY gate as shown in Figure 8 of their Supplementary Material.

Oizumi et al. (2014)
This is the canonical OR+AND+XOR system that is often used to demonstrate how to calculate Φ Tononi (2015).

Tononi et al. (2016)
This paper demonstrates the calculation of Φ for a simple system of four interacting logic gates: MAJORITY+OR+AND+AND. Following the authors, we use the initial state s_0 = 1110. The TPM is given below.

Hoel et al. (2016)
This paper examines several small Boolean networks at both micro- and macro-scales. We choose to analyze the smallest of the microsystems here, which is a system of four interconnected AND gates with noisy input. Following the authors, we analyze the system in initial state s_0 = 0000. Due to the noisy input, the TPM is not deterministic and therefore cannot be written in the compact state-by-node form. Instead, it must be written as an N by N matrix, where N is the number of states and entry (i, j) specifies the probability of state i transitioning to state j at time step t + 1 (a standard state-by-state TPM). The TPM is given below.

Marshall et al. (2017)
This model system is the fission yeast cell cycle from Marshall et al. (2017). As mentioned in the main text, we study the three-node subsystem rather than the full eight-node subsystem (plus one external node) studied in the original publication. To calculate the spectrum of Φ values for this subsystem (or just the PyPhi Φ value), the TPM for the entire system (all nine nodes) is required. Therefore, there are 512 states in the TPM. Following the authors, the initial state of the system is set to s_0 = 000110011. In little-endian binary notation (most significant bit on the right), the TPM used is as follows: